ABSTRACT

The friends-of-friends algorithm (hereafter, FOF) is a percolation algorithm which is routinely used to iden- 
tify dark matter halos from N-body simulations. We use results from percolation theory to show that the 
boundary of FOF halos does not correspond to a single density threshold but to a range of densities close to a 
critical value that depends upon the linking length parameter, b. We show that for the commonly used choice of 
b = 0.2, this critical density is equal to 81.62 times the mean matter density. Consequently, halos identified by
the FOF algorithm enclose an average overdensity which depends on their density profile (concentration) and 
therefore changes with halo mass contrary to the popular belief that the average overdensity is ~180. We derive 
an analytical expression for the overdensity as a function of the linking length parameter b and the concentra- 
tion of the halo. Results of tests carried out using simulated and actual FOF halos identified in cosmological 
simulations show excellent agreement with our analytical prediction. We also find that the mass of the halo 
that the FOF algorithm selects crucially depends upon mass resolution. We find a percolation theory motivated 
formula that is able to accurately correct for the dependence on number of particles for the mock realizations 
of spherical and triaxial Navarro-Frenk-White halos. However, we show that this correction breaks down when
applied to the real cosmological FOF halos due to presence of substructures. Given that abundance of substruc- 
ture depends on redshift and cosmology, we expect that the resolution effects due to substructure on the FOF 
mass and halo mass function will also depend on redshift and cosmology and will be difficult to correct for in 
general. Finally, we discuss the implications of our results for the universality of the mass function. 
Subject headings: cosmology: theory - halos: formation - methods: numerical 



1. INTRODUCTION 

Over the last three decades, cosmological simulations have 
been playing an ever increasing role in testing cosmological 
structure formation models against observations using statis- 
tics that can be reliably measured in both. Given that most 
of the available observational information is about virialized 
peaks in the overall matter distribution, identification of cor- 
responding virialized peaks, or halos, in simulations is of crit- 
ical importance. 

A number of automated halo finding algorithms have been developed over the years
(e.g., Knebe et al. 2011 and references therein). One of the most popular of these is
the "Friends-Of-Friends" (hereafter, FOF) algorithm, which uniquely defines groups that
contain all particles separated by a distance less than a given linking length, $b\bar{l}$,
where $\bar{l}$ is the mean interparticle separation in simulations and b is a free
parameter of the algorithm. The FOF algorithm is commonly applied both to identify groups
of galaxies in redshift catalogs (Huchra & Geller 1982; Press & Davis 1982; Einasto et al.
1984; Eke et al. 2004; Berlind et al. 2006) and virialized halos in cosmological
simulations (Einasto et al. 1984; Davis et al. 1985; Frenk et al. 1988; Lacey & Cole 1994;
Klypin et al. 1999; Jenkins et al. 2001; Warren et al. 2006; Gottlöber & Yepes 2007).

An attractive feature of the FOF algorithm is its simplicity: the result depends solely
on the linking length in units of the mean interparticle separation, b. The FOF algorithm
does not assume any particular halo shape and can therefore better match the generally
triaxial mass distribution in halos forming in hierarchical structure formation models.
In addition, studies over the last decade indicate that the appropriately parameterized
mass function of FOF halos is universal for different redshifts and cosmologies at least
to ~ 10%, although real systematic variations of < 10% do exist (Jenkins et al. 2001;
White 2002; Evrard et al. 2002; Hu & Kravtsov 2003; Warren et al. 2006; Reed et al. 2007;
Lukić et al. 2007; Tinker et al. 2008; Bhattacharya et al. 2010; Crocce et al. 2010;
Courtin et al. 2010). The mass function of halos identified using the spherical
overdensity (SO) algorithm, on the other hand, exhibits considerably larger differences
for different cosmologies and redshifts (White 2001; Tinker et al. 2008). Given the
importance of the halo mass function in interpreting observed
counts of galaxies and clusters, it is interesting to understand 
the origin of deviations from universality, the role of mass 
definition, and differences between mass functions defined 
with the FOF and SO halo finders (e.g., Jenkins et al. 2001; White 2001, 2002;
Lukić et al. 2009). This, in turn, requires good understanding of properties of the
FOF-identified groups. For example, a recent study by Courtin et al. (2010) shows that the degree
of universality depends sensitively on the choice of the linking 
length parameter b. 

One could expect that for a given value of b, the FOF al- 
gorithm defines the boundary of a halo as corresponding to a 
certain isodensity surface, at least in the limit of large number of particles.
Frenk et al. (1988) indicate that the overdensity (defined with respect to the mean
density of the universe: $\delta = \rho/\bar{\rho} - 1$) of this surface is
$\delta_{\rm FOF} \approx 2b^{-3}$. Lacey & Cole (1994; see also Summers et al. 1995 and
Audit et al. 1998) quote a value four times smaller,
$\delta_{\rm FOF} = 3/(2\pi b^3) \approx 0.48\,b^{-3}$, corresponding to the local
overdensity of two particles within a sphere of radius b. Clearly, such a local
overdensity is the absolute minimum overdensity that should be sampled by the particles of
an FOF halo. For the most commonly used value of b = 0.2 this corresponds to a local
overdensity of $\delta_{\rm FOF} \approx 60$, which for an isothermal density profile,
$\rho(r) \propto r^{-2}$, corresponds to an enclosed overdensity of
$3\delta_{\rm FOF} \approx 180$. This value is close to the virial overdensity predicted
by the spherical collapse model in the Einstein-de Sitter cosmology and is usually
regarded as a justification for using b = 0.2 in analyses of simulations.
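
The arithmetic behind these classical estimates can be checked in a few lines. The sketch below is illustrative only: the function names are ours and do not come from any of the cited works. It evaluates the two quoted scalings and the factor of three that relates the local and mean enclosed density for a $\rho \propto r^{-2}$ profile.

```python
import math

def delta_frenk(b):
    """Surface overdensity scaling attributed to Frenk et al. (1988): ~ 2 b^-3."""
    return 2.0 * b ** -3

def delta_lacey_cole(b):
    """Local overdensity quoted from Lacey & Cole (1994): 3 / (2 pi b^3)."""
    return 3.0 / (2.0 * math.pi * b ** 3)

def enclosed_isothermal(delta_local):
    """For rho ~ r^-2 the mean density inside r is three times the local density at r."""
    return 3.0 * delta_local

b = 0.2
local = delta_lacey_cole(b)
print(f"Frenk et al. scaling: {delta_frenk(b):.1f}")
print(f"Lacey & Cole local:   {local:.1f}")
print(f"isothermal enclosed:  {enclosed_isothermal(local):.1f}")
```

For b = 0.2 the last two numbers come out close to 60 and 180, the values quoted in the text.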

More recently, Warren et al. (2006) have noted that their experiments on Poisson
realizations of isothermal halos indicate that the FOF algorithm identifies the boundary
at an overdensity $\delta_{\rm FOF} \approx 74$, which corresponds to an enclosed
overdensity of ≈ 280 rather than the canonical value of 180. Indeed, they report that
direct measurements of internal overdensities of the FOF halos in their cosmological
simulations identified with b = 0.2 range from ~ 200 for the largest simulation boxes to
~ 400 for the smallest boxes. Given that small boxes resolve 
predominantly smaller mass halos compared to larger boxes, 
this result hints that the internal overdensity of the FOF halos 
is actually mass dependent. 

Given that the FOF algorithm identifies the boundary at a local overdensity and halos are
described by a Navarro et al. (1997, hereafter NFW) profile with mass-dependent
concentration, this result is not surprising. However, concentration also strongly depends
on cosmology and redshift (e.g., Bullock et al. 2001; Zhao et al. 2003a, 2009), which
immediately implies that the internal overdensity of FOF halos identified with a given
value of b is also redshift and cosmology dependent. Interpretation of the FOF halo mass
function and other statistics is therefore not trivial. For example, Halo
Occupation Distribution (HOD) models typically assume that
halos are defined within a spherical radius enclosing a well- 
defined overdensity. Also, creating mock galaxy catalogs by 
assigning galaxies to FOF halos requires knowledge of the in- 
ternal halo overdensities in order to model the target galaxy 
bias properly. 

In this study we present a detailed analysis of the halo 
boundary and the corresponding overdensity selected by the 
FOF algorithm with a given linking length b based on random 
particle realizations of spherical NFW halos. We also present 
an analytical interpretation of the results of these experiments 
and compare its predictions to overdensities of FOF halos in 
cosmological simulations. We show that the boundary of the 
FOF halos corresponds not to a single local overdensity, but 
to a range of overdensities around a characteristic value that 
can be understood on the basis of percolation theory. For the 
commonly used value of b = 0.2, the characteristic local overdensity is
$\delta \approx 81$, a value higher than that quoted in previous studies.
Correspondingly, the enclosed overdensity of the FOF halos is considerably higher than
thought before and for b = 0.2 ranges from ~ 250 to ~ 600 for typical halo
concentrations (overdensities for other values of b scale as $\propto b^{-3}$).

The paper is organized as follows. In § 2 we present tests of the FOF algorithm on
Monte Carlo realizations of idealized spherical NFW halos and show explicitly that 1) the
boundary of FOF halos does not correspond to a single local overdensity, but rather to a
range of overdensities, and 2) the enclosed overdensities of the FOF halos are
significantly larger than commonly thought and depend on concentrations of halos and thus
on mass, redshift, and cosmology. In § 3 we develop a simple analytic model that
encapsulates results of the Monte Carlo experiments of § 2 (see also the Appendix for
interpretation of these results in the context of percolation theory) and present tests of
this model against results of cosmological simulations. In § 4 we discuss implications of
our results for the universality of the halo mass function. In § 5 we interpret results
for idealized realizations of NFW halos in the context of percolation theory and present
an accurate formula describing the dependence of the FOF mass on mass resolution based on
this theory. In § 5 we also consider real halos extracted from cosmological simulations of
a ΛCDM cosmology and show that substructure present in real halos makes the behavior of
the FOF masses with resolution even more complicated. Finally, we summarize our results
and conclusions in § 6. In
the Appendix, we review the basics of the percolation theory 
and demonstrate how the boundary of the FOF halos and their 
mass can be understood and predicted in its context. 

2. TESTS WITH MONTE CARLO REALIZATIONS OF SPHERICAL 
NFW HALOS 

To explore the boundary of the FOF halos and their enclosed overdensities, we follow the
approach of Lukić et al. (2009) and consider Monte Carlo realizations of idealized
spherical halos. We assume that the internal density distribution of the halos is
described by the NFW density profile (Navarro et al. 1997):



$$ n(r) = \frac{A}{(r/r_s)\,(1 + r/r_s)^2} , \tag{1} $$



which is a reasonable approximation to density profiles of halos formed in CDM
cosmologies. Here, $r_s$ denotes the scale radius. The boundary of a halo is usually
defined with respect to the radius $R_\Delta$ that encloses internal overdensity $\Delta$
with respect to the mean density of the universe. The radii $r_s$ and $R_\Delta$ are
related via the concentration parameter $c_\Delta = R_\Delta/r_s$. The normalization, A,
is then given by



$$ A = \frac{N_\Delta\,c_\Delta^3}{4\pi R_\Delta^3}\,\frac{1}{\mu(c_\Delta)} , \tag{2} $$



where $N_\Delta$ is the number of particles within $R_\Delta$ and the function
$\mu(x)$ is given by



$$ \mu(x) = \ln(1 + x) - \frac{x}{1 + x} . \tag{3} $$



For the Monte Carlo realizations presented in this section, we assume a concentration of
$c_\Delta = 10$. We generalize our results for other concentrations in the following
section. We generate such realizations with varying number of particles, $N_p$, and mean
interparticle separation, $\bar{l}$. The latter can be expressed in terms of the radius
$R_\Delta$ and the number of particles $N_\Delta$ as



$$ \bar{l} = \left( \frac{4\pi R_\Delta^3\,\Delta}{3 N_\Delta} \right)^{1/3} . \tag{4} $$



As the boundary that the FOF algorithm will select is not known a priori, we
conservatively generate the particle distribution up to the radius of $2R_\Delta$.
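
A realization of this kind can be drawn by inverting the enclosed-mass fraction of the NFW profile. The following is a minimal sketch under our own assumptions, not the code used for the experiments reported here: radii are expressed in units of $R_\Delta$, the inversion uses plain bisection, and the function names, seed, and particle number are illustrative choices.

```python
import math
import random

def mu(x):
    """NFW mass function mu(x) = ln(1 + x) - x / (1 + x) (eq. 3)."""
    return math.log1p(x) - x / (1.0 + x)

def sample_nfw_positions(n, c, r_max=2.0, seed=0):
    """Draw n positions (units of R_Delta) from an NFW profile truncated at r_max.

    The enclosed mass fraction is mu(c r) / mu(c r_max); each uniform deviate
    is mapped to a radius by bisection, then given a random isotropic direction.
    """
    rng = random.Random(seed)
    x_max = c * r_max
    mu_max = mu(x_max)
    points = []
    for _ in range(n):
        target = rng.random() * mu_max
        lo, hi = 0.0, x_max
        for _ in range(60):  # bisection on x = r / r_s
            mid = 0.5 * (lo + hi)
            if mu(mid) < target:
                lo = mid
            else:
                hi = mid
        r = 0.5 * (lo + hi) / c
        cos_t = rng.uniform(-1.0, 1.0)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        sin_t = math.sqrt(1.0 - cos_t * cos_t)
        points.append((r * sin_t * math.cos(phi), r * sin_t * math.sin(phi), r * cos_t))
    return points

def mean_separation(R, n_inside, delta):
    """Mean interparticle separation from eq. 4."""
    return (4.0 * math.pi * R ** 3 * delta / (3.0 * n_inside)) ** (1.0 / 3.0)

points = sample_nfw_positions(10000, c=10.0)
radii = [math.sqrt(x * x + y * y + z * z) for x, y, z in points]
inside = sum(r <= 1.0 for r in radii)
print("fraction of particles inside R_Delta:", inside / len(radii))
print("mean separation for R_Delta = 1, Delta = 180:", mean_separation(1.0, inside, 180.0))
```

For c = 10 the fraction inside $R_\Delta$ should be close to $\mu(10)/\mu(20)$, i.e. the realization holds roughly 1.4 times more particles than lie within $R_\Delta$.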

Without loss of generality, we use $\Delta = 180$, one of the most commonly used
mass-defining overdensities, and generate a series of halo realizations with $N_{180}$
varying from $10^6$ to 100 particles. To reduce Poisson noise, for small $N_{180}$ we
generate multiple realizations and average over them. We use 10, 100, and 1000
realizations for halos with $10^4$, $10^3$, and 100 particles, respectively. As the
particle distribution extends up to $2R_{180}$, the actual number of particles used in
each of the realizations is larger than $N_{180}$ roughly by a factor of 1.4. We run the
FOF halo finder on each of the halo realizations with a linking length equal to
$0.2\,\bar{l}$. The algorithm links particles
with each other if they are closer than the linking length. In 
what follows, we restrict our attention to the largest group that 
is found by the FOF algorithm. 
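
The linking rule itself is compact enough to sketch. The version below is an illustration under our own assumptions rather than the halo finder used in this work: it bins particles on a cubic grid of side equal to the linking length, so that only neighbouring cells need to be searched, and merges friends with a union-find structure. The function name and the toy particle set are hypothetical.

```python
import math
from collections import defaultdict

def friends_of_friends(points, link):
    """Return FOF groups (lists of point indices, largest first).

    Two points are friends if their separation is below `link`; groups are
    the connected components of the friendship graph.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Hash particles into cells of side `link`; friends can only sit in the
    # same cell or in one of its 26 neighbours.
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[tuple(math.floor(v / link) for v in p)].append(idx)

    link2 = link * link
    offsets = [(dx, dy, dz) for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)]
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in offsets:
            neighbours = cells.get((cx + dx, cy + dy, cz + dz))
            if not neighbours:
                continue
            for a in members:
                for b in neighbours:
                    if a < b:  # visit each pair once
                        d2 = sum((u - v) ** 2 for u, v in zip(points[a], points[b]))
                        if d2 < link2:
                            ra, rb = find(a), find(b)
                            if ra != rb:
                                parent[ra] = rb

    groups = defaultdict(list)
    for i in range(len(points)):
        groups[find(i)].append(i)
    return sorted(groups.values(), key=len, reverse=True)

# Toy example: a chain of five particles spaced by 0.15 plus one distant particle.
chain = [(0.15 * i, 0.0, 0.0) for i in range(5)]
groups = friends_of_friends(chain + [(5.0, 5.0, 5.0)], link=0.2)
print([len(g) for g in groups])
```

The chain is joined into a single group even though its end points are farther apart than the linking length, which is the percolation behaviour discussed in the text.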

Figures 1, 2, and 3 show the fraction of particles in a Monte Carlo halo,
$f_{\rm accept}$, that are grouped into the central halo by the FOF algorithm at a given
radius as a function of radius, local density, and enclosed overdensity, respectively.
Although we
generate realizations of spherically symmetric halos with no 
physical substructure, the figures show that the boundary of 
the FOF-identified halos is not sharp. The particles joined into 
the FOF group span a range of radii and overdensities. The 
"fuzziness" of the boundary increases dramatically for real- 
izations with the smallest number of particles. Note, however, 
that even for realizations with millions of particles, $f_{\rm accept}$ as a
function of radius or overdensity does not converge to a step 
function, but rather converges to a well-defined shape span- 
ning a range of radii. This implies that the boundary selected 
by the FOF algorithm is inherently fuzzy. 

Figure 2 also clearly shows that the local density of the majority of particles within
the fuzzy FOF boundary is larger than $n_{180}$. Correspondingly, the mean enclosed
overdensity within this boundary is also much larger than 180, contrary to what is
usually assumed for the b = 0.2 linking length (Fig. 3).

The particles that are joined into an FOF group depend upon 
the percolation properties of the particle distribution. In the 
Appendix, we show that the behavior of $f_{\rm accept}$ as a function
of radius and overdensity demonstrated by Figures 1-3 can be
understood in the framework of percolation theory. For ex- 
ample, percolation theory predicts that for a uniform particle 
distribution percolation (i.e., formation of a group spanning 
the entire region) should occur at the local number density 
equal to a critical value of 



$$ n_{\rm crit} = n_c\,(b\bar{l})^{-3} . \tag{5} $$



This corresponds to the local overdensity (with respect to the mean density
$\bar{n} = \bar{l}^{-3}$) of

$$ \delta_{\rm crit} = \frac{n_{\rm crit}}{\bar{n}} - 1 = n_c\,b^{-3} - 1 . \tag{6} $$



Here $n_c$ is a universal constant that arises in the percolation problem of spheres that
follow a Poisson distribution. The value of this constant has been calibrated via
extensive Monte Carlo experiments (Lorenz & Ziff 2001):

$$ n_c = 0.652960 \pm 0.000005 . \tag{7} $$



Fig. 1. — The fraction of particles that are joined into the largest group
by the FOF algorithm with b = 0.2 as a function of the radius (in units of
the radius $R_{180}$ enclosing the mean overdensity $\Delta = 180$) for Monte Carlo
realizations of spherical NFW halos with varying number of particles, $N_{180}$
(lines of different style and color, as indicated in the legend). The vertical
solid line marks the radius at which the density equals the critical threshold
for percolation (eqs. 5 and 6).

Fig. 2. — Same as Figure 1 but as a function of the local number density
(calculated analytically using the position of the particle), in units of the local
number density at $R_{180}$.

We can expect that the boundary of FOF halos should approximately correspond to
$n_{\rm crit}$ because percolation across a radial bin will be inhibited at smaller
densities. For our choice of b = 0.2, this corresponds to
$n_{\rm crit} = 81.62\,\bar{l}^{-3}$, i.e., local overdensity $\delta_{\rm crit} = 80.62$.
This overdensity is shown by the vertical line in Figure 2, while vertical lines in
Figures 1 and 3 show the corresponding radius and enclosed mean overdensity. The figures
show that the percolation threshold does indeed predict a characteristic overdensity and
radius roughly in the middle of the FOF boundary range. In the Appendix, we show that
percolation theory also explains the shape of $f_{\rm accept}$ as a function of radius
and overdensity for $n > n_{\rm crit}$, and the increase in the fuzziness of the boundary
with decreasing number of particles used.
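
As a quick numerical check of the threshold values quoted above, the sketch below evaluates eqs. 5-7 for a few linking lengths. The helper names are ours, and densities are expressed in units of the mean density $\bar{l}^{-3}$.

```python
N_C = 0.652960  # percolation constant of eq. 7 (Lorenz & Ziff 2001)

def n_crit(b):
    """Critical local number density in units of the mean density (eq. 5)."""
    return N_C * b ** -3

def delta_crit(b):
    """Critical local overdensity (eq. 6)."""
    return n_crit(b) - 1.0

for b in (0.17, 0.20, 0.23):
    print(f"b = {b:.2f}: n_crit = {n_crit(b):7.2f}, delta_crit = {delta_crit(b):7.2f}")
```

The b = 0.20 row reproduces the values 81.62 and 80.62 given in the text.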

For our immediate purposes, however, we can consider the 
empirical results of our Monte Carlo tests for the overdensities 
of the FOF halos. In the next section, we present a simple 
analytic model that describes this overdensity as a function of 
linking length b and halo concentration c. 




Fig. 3. — Same as Figure 1 but as a function of the average enclosed overdensity,
$\Delta_{\rm enc}$, normalized to overdensity of 180.

Fig. 4. — The overdensity of halos as a function of the concentration of halos
selected by the FOF algorithm for three representative values of the linking
length b = 0.17, 0.20, 0.23 shown by the short-dashed, solid, and long-dashed
lines, respectively.

3. CONCENTRATION DEPENDENCE OF THE ENCLOSED FOF 
OVERDENSITY 

3.1. Analytical model 

In the previous section we showed that the boundary of FOF halos corresponds to a wide
range of local overdensities (with the width of the range dependent on the number of
particles in a halo) around a characteristic local density
$n_{\rm crit} = n_c\,(b\bar{l})^{-3}$ or the corresponding local overdensity
$\delta_{\rm crit} = n_{\rm crit}/\bar{n} - 1 = n_c\,b^{-3} - 1$. For the commonly used
value of the linking length parameter b = 0.2, $\delta_{\rm crit} = 80.62$.
Given the characteristic local overdensity at the boundary, it is straightforward to
derive an analytical expression for the average enclosed overdensity assuming that halos
have NFW density profiles.

Let us denote the number of particles selected by the FOF algorithm as $N_\Delta$, and
the effective spherical radius enclosing these particles as $R_\Delta$, where $\Delta$ is
the overdensity of the FOF halo which we wish to determine. Evaluating the number density
at $R_\Delta$ using equations 1 and 2 and equating it to the critical number density,
$n_{\rm crit}$, yields



$$ \frac{N_\Delta\,c_\Delta^3}{4\pi R_\Delta^3}\,\frac{1}{\mu(c_\Delta)\,c_\Delta\,(1 + c_\Delta)^2} = n_c\,(b\bar{l})^{-3} . \tag{8} $$



Note that here $c_\Delta = R_\Delta/r_s$ is the concentration defined with
respect to $R_\Delta$.

The enclosed overdensity, $\Delta$, of the halo is then given by



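\begin{align}
\Delta &= \frac{3 N_\Delta}{4\pi R_\Delta^3}\,\bar{l}^{3} - 1 \tag{9} \\
       &= 3\,n_c\,b^{-3}\,\frac{\mu(c_\Delta)\,(1 + c_\Delta)^2}{c_\Delta^2} - 1 . \tag{10}
\end{align}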


This explicitly shows that the overdensity of an FOF halo de- 
pends not only upon the linking length parameter, b, but also 
upon its concentration. In Figure 4 we show the average FOF
halo overdensity as a function of the concentration, $c_\Delta$, for
three representative values of b. 
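
Equation 10 is straightforward to evaluate numerically. The following sketch, with function names of our own choosing, tabulates the enclosed overdensity for the three linking lengths above and a few concentrations picked purely for illustration.

```python
import math

N_C = 0.652960  # percolation constant of eq. 7

def mu(x):
    """NFW mass function of eq. 3."""
    return math.log1p(x) - x / (1.0 + x)

def fof_overdensity(b, c):
    """Enclosed overdensity of an FOF halo with linking length b and concentration c (eq. 10)."""
    return 3.0 * N_C * b ** -3 * mu(c) * (1.0 + c) ** 2 / c ** 2 - 1.0

for b in (0.17, 0.20, 0.23):
    row = ", ".join(f"c={c:>2}: {fof_overdensity(b, c):6.1f}" for c in (5, 10, 20))
    print(f"b = {b:.2f} -> {row}")
```

At fixed b the overdensity grows with concentration, and at fixed concentration it scales as $b^{-3}$ apart from the constant offset of 1.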

Note that one needs to know the concentration-mass rela- 
tion to predict the overdensity of halos as a function of the 
FOF halo mass. The concentration of halos depends upon the 
radius of the halo (and hence the overdensity definition). The 
concentration and the average overdensity of FOF halos as a 
function of their mass can be calculated using the following 
steps. (i) As a first guess, we assume that FOF halos have a certain overdensity (say
$\Delta_i$) with respect to the background. (ii) We use the concentration-virial mass
relation given by Zhao et al. (2009) and convert it to a concentration-mass relation for
halos with overdensity $\Delta_i$. (iii) This concentration is used to find a new
overdensity using Eq. 10. We repeat steps
(ii) and (iii) until we converge to a value of overdensity (and 
concentration). 
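
The iteration over steps (i)-(iii) can be sketched as a fixed-point loop. The example below is deliberately simplified and rests on assumptions that are ours: it treats a single halo with a given virial concentration instead of a full concentration-mass relation (so the mass conversion between definitions is not modelled), it takes the virial density in units of the mean density as an input, and the values c_vir = 7 and 360 are placeholders, not values taken from Zhao et al. (2009).

```python
import math

N_C = 0.652960  # percolation constant of eq. 7

def mu(x):
    """NFW mass function of eq. 3."""
    return math.log1p(x) - x / (1.0 + x)

def fof_density(b, c):
    """Mean enclosed density of an FOF halo in units of the mean density (Delta + 1 in eq. 10)."""
    return 3.0 * N_C * b ** -3 * mu(c) * (1.0 + c) ** 2 / c ** 2

def convert_concentration(c_ref, dens_ref, dens_new):
    """Concentration of the same NFW halo at the radius enclosing dens_new.

    At fixed r_s the mean enclosed density scales as mu(c) / c^3, which
    decreases monotonically with c, so bisection (in log c) is sufficient.
    """
    target = dens_new / dens_ref * mu(c_ref) / c_ref ** 3
    lo, hi = 1e-4, 1e4
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if mu(mid) / mid ** 3 > target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def fof_overdensity_iterative(b, c_vir, dens_vir, tol=1e-10, max_iter=100):
    """Iterate steps (i)-(iii); returns (overdensity, concentration at the FOF boundary)."""
    dens = 180.0  # step (i): first guess
    for _ in range(max_iter):
        c = convert_concentration(c_vir, dens_vir, dens)  # step (ii)
        dens_next = fof_density(b, c)                      # step (iii)
        if abs(dens_next - dens) < tol * dens:
            return dens_next - 1.0, c
        dens = dens_next
    raise RuntimeError("iteration did not converge")

delta_fof, c_fof = fof_overdensity_iterative(b=0.2, c_vir=7.0, dens_vir=360.0)
print(f"converged overdensity: {delta_fof:.1f}, concentration: {c_fof:.2f}")
```

In this single-halo setting the converged concentration can be cross-checked against the condition that the local density at the boundary equals $n_{\rm crit}$, which is what the test attached to this sketch does.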

Note that since the concentration of a halo depends on cos- 
mology, redshift, and halo mass, the enclosed overdensity of 
halos selected by the FOF algorithm also depends upon cos- 
mology, redshift, and mass. Furthermore, even for a fixed cos- 
mology, redshift, and mass, halo concentrations exhibit sub- 
stantial scatter and we can therefore expect a corresponding 
scatter in enclosed overdensities. We will consider these de- 
pendencies and scatter in the next section, where we compare 
the predictions of equation 10 to overdensities of FOF halos
identified in cosmological simulations. 

3.2. Comparison with cosmological simulations 

To test the simple model presented in the previous section, we compare predictions of
equation 10 with actual overdensities of halos identified in dissipationless cosmological
simulations of the ΛCDM model. Halos have been identified using the FOF algorithm with
different linking lengths b and at different redshifts in two cosmological simulations of
the same flat ΛCDM cosmology: the matter and baryon density in units of the critical
density $\Omega_m = 1 - \Omega_\Lambda = 0.27$ and $\Omega_b = 0.0469$, the Hubble
constant $h = H_0/(100\ {\rm km\,s^{-1}\,Mpc^{-1}}) = 0.70$, the rms amplitude of linear
fluctuations in spheres of radius $8\,h^{-1}$ Mpc $\sigma_8 = 0.82$, and the power law
slope of the initial power spectrum, $n_s = 0.95$.

[Footnote: Zhao et al. (2009) calibrate the concentration-mass relation for
concentrations and masses defined with respect to the radius enclosing the virial
overdensity, $\Delta_{\rm vir}$.]

Fig. 5. — Enclosed overdensities of the FOF halos identified with linking
lengths b = 0.085, 0.17, and 0.20 in the Bolshoi and MultiDark simulations.
In each panel, the dashed lines show the median overdensity, while the dotted
lines show the 16 and 84 percentiles of the distribution. The blue and
purple lines correspond to the results of the Bolshoi and MultiDark simulations,
respectively. The grey points show halos from the Bolshoi simulation
(the MultiDark halos are not shown for clarity). The red solid lines show the
prediction of our model given by eq. 10 and the concentration-mass relation of
Zhao et al. (2009). The red dotted lines show the rms scatter predicted by the
model, if we assume a scatter of 0.14 dex of concentrations at a given mass.

The first is the Bolshoi simulation of a cubic volume of size
$L_{\rm B} = 250\,h^{-1}$ Mpc, described in detail in Klypin et al. (2010), while the
second is the MultiDark simulation of volume size $L_{\rm MD} = 1\,h^{-1}$ Gpc (Prada et
al., in preparation). Both simulations followed the evolution of $2048^3$ particles,
which corresponds to particle masses of $1.36 \times 10^8\,h^{-1}\,M_\odot$ and
$8.72 \times 10^9\,h^{-1}\,M_\odot$ for the Bolshoi and MultiDark simulations,
respectively. The peak spatial resolution was $1\,h^{-1}$ kpc and $7\,h^{-1}$ kpc in
these simulations, respectively.

[Footnote: Data from both simulations are publicly available at
http://www.multidark.org/MultiDark/.]

The FOF algorithm used to identify halos in these simulations is based on the minimal
spanning tree and is described in Knebe et al. (2011). Given that the shape of the FOF
halos can be arbitrary and rather complicated, measurement of their volume is not
trivial. We estimate the volume employing the following procedure. For each FOF halo, the
three dimensional distribution of particles is projected onto a two dimensional plane
perpendicular to one of the coordinate axes (e.g., the x-axis in the following). A grid
of cells of size $s = b\bar{l}$ is then overlaid on this plane. The volume occupied by
particles in each individual cell i is estimated as

$$ V_i = s^2\,(x_{\rm max} - x_{\rm min}) , \tag{11} $$

where $x_{\rm min}$ and $x_{\rm max}$ are the minimum and maximum x coordinates of
particles in the cell and $x_{\rm max} - x_{\rm min}$ is the extent of the particle
distribution along x. The total volume of the halo, $V_x$, is calculated as a sum over
all cells containing particles, $V_x = \sum_i V_i$. This procedure is repeated for the
other two axes and the final halo volume is assumed to be the maximum of
$V_x$, $V_y$, and $V_z$.
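
The volume estimate described above can be sketched as follows. This is an illustration under our reading of the procedure, not the code used for the measurements: the function names are hypothetical, and the per-cell volume is taken to be the cell area times the extent of the particles along the projection axis.

```python
import math

def projected_volume(points, s, axis):
    """Sum over occupied 2D cells of s^2 times the particle extent along `axis`."""
    u, v = [k for k in range(3) if k != axis]
    extent = {}
    for p in points:
        cell = (math.floor(p[u] / s), math.floor(p[v] / s))
        lo, hi = extent.get(cell, (p[axis], p[axis]))
        extent[cell] = (min(lo, p[axis]), max(hi, p[axis]))
    return sum(s * s * (hi - lo) for lo, hi in extent.values())

def halo_volume(points, s):
    """Final volume: the maximum of the three single-axis estimates."""
    return max(projected_volume(points, s, axis) for axis in range(3))

def enclosed_overdensity(n_particles, volume, mean_density):
    """Overdensity of a group of n_particles occupying `volume`."""
    return n_particles / (volume * mean_density) - 1.0

# Toy check: a 10 x 10 x 10 cubic lattice with unit spacing and unit cell size.
lattice = [(float(i), float(j), float(k))
           for i in range(10) for j in range(10) for k in range(10)]
print("estimated volume:", halo_volume(lattice, s=1.0))
```

For the toy lattice each of the 100 occupied columns has an extent of 9 along the projection axis, so the estimate is 900 in these units.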

The procedure used for estimating the volume roughly approximates the convex hull
algorithm (see http://en.wikipedia.org/wiki/Convex_hull). It is designed to avoid
the pitfall of estimating the volume using a 3D grid as a sum of
cells containing particles. Such an estimate leaves many empty
cells within the halo unaccounted for. Moreover, such a method
does not converge to a well-defined volume value as the 3D
grid cell size is varied.

Figure 5 shows overdensities of individual FOF halos selected from simulations as a
function of the FOF halo mass selected using different linking length parameters. The
three panels show results for FOF with linking lengths b = 0.085, b = 0.17, and b = 0.2.
In each panel, the dashed lines show the median overdensity as a function of halo mass,
while the dotted lines show the 16 and 84 percentiles of the distribution. The blue
(short-dashed) and purple (long-dashed) lines correspond to the results of the Bolshoi
and MultiDark simulations, respectively. The red solid lines show the prediction for the
overdensity of FOF halos as a function of halo mass given by eq. 10 and the
concentration-mass relation of Zhao et al. (2009). The red dotted lines show the rms
scatter predicted by the model, if we assume a scatter of 0.14 dex of concentrations at
a given mass, as measured in cosmological simulations (e.g., Wechsler et al. 2002).

The figure shows that the simple model of equation 10 captures the median overdensities
of FOF halos at these different linking lengths rather well. The scatter of overdensities
in simulated halos is also consistent with the scatter expected for the scatter in
concentrations. The mass dependence of $\Delta$ is qualitatively consistent in the model
and simulations, except perhaps at the smallest and largest masses. At small masses
overdensities of simulated halos exhibit a downturn in both the Bolshoi and MultiDark
simulations. The masses at which the downturn occurs are different in the two
simulations. This downturn is due to the percolation properties of halos represented by
small particle numbers, as we discuss in more detail in § 5 below and in the Appendix.
Note, for example, that the downturn shifts to smaller masses for smaller values of b
(i.e., larger local particle densities at the boundary) and is almost entirely outside
the shown mass region for b = 0.085. The overdensities of simulated halos also exhibit a
somewhat weaker trend with mass than predicted by our model for masses
$> 5 \times 10^{14}\,h^{-1}\,M_\odot$. It is not clear what is the source of this
discrepancy, but we note that it is quite small and amounts to less than 10%.

[Footnote URLs: http://www.multidark.org/MultiDark/ ; http://en.wikipedia.org/wiki/Convex_hull]

Figure 6 shows overdensities of the FOF halos identified with b = 0.17 at redshifts
z = 0.0, 1.0, and 2.5. The evolution of overdensity predicted by the model due to the
redshift evolution of concentrations, predicted using the model of Zhao et al. (2009),
matches the redshift trend observed in the simulations remarkably well. The scatter of
overdensities is also well reproduced by the scatter of concentrations at all redshifts.
Note that the enclosed overdensity for this b in the mass range probed by the simulations
reaches the floor value of ≈ 400-450 by z = 2.5, as the virial concentration of halos
reaches a floor of $c_{\rm vir} \approx 4$ (Zhao et al. 2003b, 2009).

4. IMPLICATIONS FOR UNIVERSALITY OF HALO MASS 
FUNCTIONS 

Our results on the enclosed overdensity of the FOF- 
identified halos have important and interesting implications 
for the interpretation of recent results on the universality of 
the halo mass function. A number of studies have found that 
the halo mass function can be expressed in a cosmology and 
redshift independent way as a universal function of the peak 
height, $\delta_c/\sigma(M)$, where $\delta_c(z)$ is the linearly evolved overdensity
of a peak at the time of collapse in the spherical collapse model and $\sigma(M)$ is the
rms fluctuation of perturbations of mass M (Sheth et al. 2001; Jenkins et al. 2001;
Warren et al. 2006; Tinker et al. 2008).

Although deviations from universal behavior have been 
found for both the FOF and SO identified halos, these deviations are markedly smaller
for the FOF mass functions (e.g., Lukić et al. 2007; Tinker et al. 2008; Courtin et al.
2010). Courtin et al. (2010) showed that deviations from universality are not random but
are correlated with the nonlinear virialization overdensity, $\Delta_{\rm vir}$, expected
from the spherical collapse model for a given cosmology and redshift. In particular, they
showed that the linking length, $b_{\rm univ}$, required to minimize deviations of the
FOF mass function from universal form for a given cosmology and redshift is correlated
with the corresponding $\Delta_{\rm vir}$ as:



$$ \left( \frac{b_{\rm univ}}{0.2} \right)^{-3} = 0.24\,\frac{\Delta_{\rm vir}}{178} + 0.68 . \tag{12} $$

This is an interesting and important result, as it indicates that deviations from
universality can be minimized if one takes into account the cosmology dependence of
virialization parameters properly. However, as noted by Courtin et al. (2010), the form
of equation 12 is different from $(b/0.2)^{-3} = \Delta_{\rm vir}/178$, which one would
expect if the FOF algorithm with b = 0.2 identified halos with a constant internal
overdensity of ≈ 178. This form thus begs for a physical explanation. Our results
presented in the previous sections can help explain this empirical correlation, at least
partially. First, we showed that the typical overdensity of FOF halos identified with
b = 0.2 at z = 0 is significantly larger than 178. Second, we showed that the overdensity
of FOF halos depends not only on b but also on halo concentrations (eq. 10), and thus on
mass, cosmology, and redshift. In light of these results we expect that the linking
length required to identify halos enclosing a certain overdensity $\Delta$ is given by
(see eq. 10)

$$ \left( \frac{b}{0.2} \right)^{-3} = \frac{\Delta + 1}{244.86}\,\psi(c_\Delta) , \tag{13} $$

where the function $\psi(c_\Delta)$ is given by

$$ \psi(c) = \frac{c^2}{\mu(c)\,(1 + c)^2} . \tag{14} $$



Equation 13 can thus be used to predict what linking length is needed to identify a
halo boundary enclosing virial overdensity $\Delta_{\rm vir}$.
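
Equations 13 and 14 can be inverted for b directly. The sketch below uses function names of our own and two illustrative inputs (a target overdensity of 178 and a concentration of 5) that are not taken from the simulations discussed here. It also verifies the round trip through eq. 10.

```python
import math

N_C = 0.652960  # percolation constant of eq. 7

def mu(x):
    """NFW mass function of eq. 3."""
    return math.log1p(x) - x / (1.0 + x)

def psi(c):
    """Concentration factor of eq. 14."""
    return c ** 2 / (mu(c) * (1.0 + c) ** 2)

def linking_length(delta, c):
    """Linking length that selects enclosed overdensity delta at concentration c (eq. 13)."""
    return 0.2 * ((delta + 1.0) / 244.86 * psi(c)) ** (-1.0 / 3.0)

def fof_overdensity(b, c):
    """Enclosed overdensity of eq. 10, used here only for the round-trip check."""
    return 3.0 * N_C * b ** -3 / psi(c) - 1.0

b_needed = linking_length(178.0, 5.0)
print(f"b for Delta = 178, c = 5: {b_needed:.3f}")
print(f"round trip overdensity:   {fof_overdensity(b_needed, 5.0):.1f}")
```

The constant 244.86 is $3 n_c / 0.2^3$, which is why eq. 13 and eq. 10 are two forms of the same relation.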

Figure 7 shows simulation results of Courtin et al. (2010) for values of $b_{\rm univ}$
as a function of $\Delta_{\rm vir}$ (squares with error bars) and the best fit to their
results (dot-dashed line). It also shows the $b_{\rm univ} - \Delta_{\rm vir}$ dependence
given by equation 13 (solid blue line). This line is computed assuming the
$c_{\rm vir} - M$ relation for a flat ΛCDM cosmology consistent with WMAP5 results given
by the model of Zhao et al. (2009) for the redshift range from z > 2 (where
$\Omega_m \approx 1.0$ and $\Delta_{\rm vir} \approx 178$) to negative redshifts into the
future to sample the low-$\Omega_m$, high-$\Delta_{\rm vir}$ regime. For all redshifts
the model is computed for a fixed halo mass $M_{\rm vir} = 10^{14}\,M_\odot$, a value
representative of the mass range probed by Courtin et al.'s simulations. As can be seen
from the figure, the prediction of equation 13 is much closer to the results of Courtin
et al. (2010) than the commonly assumed $(b/0.2)^{-3} = \Delta_{\rm vir}/178$. Note that
the slope is also different due to the dependence on concentrations via the function
$\psi(c)$.

This implies that the results of Courtin et al. (2010) indeed indicate
that deviations from universality are largely due to the
use of halo parameters not adjusted for different virialization
overdensities in different cosmologies and redshifts. Note,
however, that the agreement between our model and their results
is not perfect. This could be due to several factors. First,
the virialization overdensities of halos may be somewhat different
from those expected in the spherical collapse model,
given that most halos form out of triaxial perturbations via a
complicated sequence of mergers and smooth accretion. Second,
the well-known bridging effect of the FOF halo finder
may play a role at smaller values of $\Delta_{\rm vir}$ (i.e., larger values
of $b$). For the commonly used value of $b = 0.2$ the FOF algorithm
joins $\approx 10-15\%$ of neighboring halos by bridging
at $z = 0$ (e.g., Davis et al. 1985; Bertschinger & Gelb 1991;
Cole & Lacey 1996; Lukić et al. 2009), although this fraction
is likely to be higher at larger redshifts (e.g., Cohn & White
2008). Figure [7], on the other hand, shows that our model
predicts that the linking length should increase to $b \approx 0.24$ to
reach $\Delta_{\rm vir}$. We can expect that bridging will become severe
for such a large linking length and would definitely affect the FOF
halo mass function. The weak dependence of $b_{\rm univ}$ on $\Delta_{\rm vir}$
for virial overdensities of $\approx 180-300$ may therefore reflect
the fact that universality of the FOF mass function is compromised
by bridging, which prevents $b_{\rm univ}$ from reaching lower
values.

A dramatic effect of bridging on the $z = 10$ halo mass function can be observed
in Figure 3 of Cohn & White (2008), which shows the abundance of FOF
halos as a function of FOF mass with $b = 0.2$ and of mass counted around the centers
of the same halos in spheres enclosing overdensity $\Delta = 180$. Although
the FOF halos for $b = 0.2$ should have mean overdensities considerably larger
than 180, and hence FOF mass smaller than the SO(180) mass, that figure shows
that the average FOF mass of halos of a given abundance is actually about
two times larger than their SO mass with $\Delta = 180$.

Fig. 6. — Overdensities of the FOF halos identified with $b = 0.17$ in the Bolshoi and MultiDark simulations at redshifts $z = 0.0$, 1.0, and 2.5 along with the median
and scatter (red solid and dotted lines) predicted by our model (eq. [10]). The line types and colors are as in Figure [5].



Some of the discrepancy between equation [13] and the simulation
results of Courtin et al. could also be due to the fact that their
points comprised simulations of different cosmologies all using
the same power spectrum and normalization $\sigma_8$ at $z = 0$,
while our prediction is made for a single cosmology as a function
of redshift. Given that concentrations of halos in a given
cosmology depend not only on $\Omega_{\rm m}$ but also on $\sigma_8$, the results of
Courtin et al. (2010) for the $b_{\rm univ}-\Delta_{\rm vir}$ scaling are likely not universal.
For example, for cosmologies with the same $\Omega_{\rm m}$ and
$\Omega_\Lambda$ but different values of $\sigma_8$, halo concentrations, and hence
the value of $b_{\rm univ}$, will be different but $\Delta_{\rm vir}$ will be the same.

Incidentally, the dependence of enclosed overdensity of 
FOF halos on concentration could also explain why devi- 
ations of the halo mass function from universality at dif- 
ferent redshifts have been found to be considerably smaller 
for the FOF halos identified with constant b than for the 
SO mass function with masses defined using constant overdensity
(White 2002; Lukić et al. 2007; Tinker et al. 2008;
Courtin et al. 2010). This more universal behavior could, in
principle, be an indication that the FOF somehow identifies 
halos better related to the initial density field or assigns mass 
to halos more correctly than the SO algorithm. This would, of 
course, be interesting for understanding the physical origin of 
the universality of the mass function. 

However, given the significant bridging effect for $b \gtrsim 0.2$
discussed above, one should already be skeptical that some
deep physics underlies a more universal behavior of the $b = 0.2$
FOF mass functions. In addition, our results imply that
smaller deviations of the FOF halo mass function from universality
are also due to a partial cancellation of some of
the redshift evolution of the halo mass function by the redshift
evolution of halo concentrations. Indeed, for ΛCDM models
for which these deviations with redshift have been studied,
the enclosed overdensities for high-mass FOF halos at
$z = 0$, when halo concentrations are relatively high, are
$\sim 300-400$. These overdensities are close to the virial overdensity
of halos in the ΛCDM cosmology. At higher redshifts,
however, halo concentrations decrease as $c(M,z) \propto (1+z)^{-1}$
(Bullock et al. 2001) until they reach a floor value of $\approx 4$
(Zhao et al. 2003a, 2009). For $c \sim 4$, the overdensity of FOF
halos should approach $\sim 250$ (see Fig. [4]), which is close to
the virial overdensity at high redshifts where $\Omega_{\rm m}(z)$ is closer
to unity. The FOF overdensity thus roughly tracks the virial
overdensity in the concordance ΛCDM cosmology. However,
we stress that this rough tracking is coincidental. This
is because halo concentrations depend on the halo formation
times (e.g., Wechsler et al. 2002; Neto et al. 2007; Zhao et al.
2009), which in turn depend on the power spectrum normalization,
among other things. Thus, concentrations would still
evolve with redshift in the Einstein-de Sitter $\Omega_{\rm m} = 1$ cosmology,
even though the virial overdensity would not. The deviations
of the FOF mass function from universality would
therefore also be affected by the power spectrum normalization,
or any other parameter that affects concentrations.

5. MASSES OF FOF HALOS 

5.1. Masses of the idealized FOF halos in the context of 
percolation theory 

Using Monte Carlo simulations of isothermal halos with
varying numerical resolution, Warren et al. (2006) were the
first to demonstrate that the mass of halos selected by the FOF
algorithm depends upon the resolution with which the halo is
sampled. They found that at lower resolutions the FOF algorithm
assigns systematically larger masses to halos. They
devised an empirical formula to correct the effects of this
systematic bias on the halo mass function. More recently,
Lukić et al. (2009) carried out Monte Carlo simulations of
NFW halos and found a qualitatively similar effect (see also
Bhattacharya et al. 2010). They also devised an empirical formula
to correct for the resolution-dependent mass bias for the
specific case of $b = 0.2$ and the idealized spherical NFW halos
that they studied. Lukić et al. (2009) showed that this correction
depends not only on the number of particles but also upon
the concentration of the halo.

As can be seen from Figure [1], our experiments also reveal
a qualitatively similar effect. The boundary identified by the 
FOF algorithm significantly widens with decreasing number 
of halo particles. Therefore, the mass selected by the FOF 
algorithm also increases with decreasing number of particles. 
In Figure [8] we show the mass of the halo identified by FOF
for each of our spherical Monte Carlo halos normalized by
$M_\Delta$, the mass expected within the overdensity predicted by
using Eq. [10]. We plot this quantity as a function of $L_{\rm size}$ given
by

$$
L_{\rm size} = \frac{2R_\Delta}{b\bar{l}} = \frac{2}{b}\left[\frac{3N_\Delta}{4\pi(\Delta+1)}\right]^{1/3}. \tag{15}
$$

Fig. 7. — Universality of the FOF halo mass function. The linking length parameter that minimizes deviations of mass functions in different cosmologies from
the universal form. Square points and the dot-dashed line show the empirical relation derived by Courtin et al. (2010). The dotted line shows the commonly assumed
scaling between overdensity and linking length parameter, $b$. The solid (blue) line shows our analytical prediction assuming the concentration of a $10^{14}\,h^{-1}M_\odot$
halo (computed using eq. [13]; see text for details).



Note that by definition $L_{\rm size}$ approximately corresponds to the
inverse of the fractional accuracy with which a halo boundary
can ever be identified by the FOF algorithm, and it depends
upon the resolution of the halo via $N_\Delta$. As described in
the appendices, $L_{\rm size}$ is thus the appropriate parameter to use
from the standpoint of percolation theory to parameterize the
dependence of the FOF mass for a given halo on the numerical
resolution.

Figure [8] shows that the FOF mass can be systematically biased
high by $\approx 10-20\%$ for $L_{\rm size} < 10$. Most of the modern
state-of-the-art simulations are in this regime. For example,
the Bolshoi and MultiDark simulations used in the previous
section followed the evolution of $2048^3 \approx 8.59 \times 10^9$ particles in
boxes of $250\,h^{-1}$ Mpc and $1000\,h^{-1}$ Mpc, respectively. For
$b = 0.2$, these simulations have $b\bar{l}$ of $\approx 24.4\,h^{-1}$ kpc and
$\approx 97\,h^{-1}$ kpc, respectively. Thus, $L_{\rm size} < 10$ corresponds to
halos with virial radii $R_\Delta \lesssim 122\,h^{-1}$ kpc and $R_\Delta \lesssim 488\,h^{-1}$ kpc,
respectively, both well within the range of halos resolved by
these simulations. A wider range of masses would be affected
for lower resolution simulations. The dependence of $L_{\rm size}$ on the
number of particles in a halo for the choice of $b = 0.2$ and a
typical halo concentration is presented in Figure [16] in the Appendix,
which shows that $L_{\rm size} \lesssim 10$ for $N_\Delta \lesssim 10^4$.
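The numbers quoted above for the Bolshoi and MultiDark simulations follow directly from the definition $L_{\rm size} = 2R_\Delta/(b\bar{l})$ with $\bar{l}$ equal to the box size divided by the number of particles per side. The following minimal sketch reproduces them; the function names are illustrative only.

```python
def linking_length_kpc(b, box_mpc, n_side):
    """Physical linking length b * lbar in h^-1 kpc, with lbar = box / n_side."""
    return b * box_mpc * 1000.0 / n_side


def l_size(r_delta_kpc, b, box_mpc, n_side):
    """L_size = 2 R_Delta / (b * lbar), with R_Delta in h^-1 kpc."""
    return 2.0 * r_delta_kpc / linking_length_kpc(b, box_mpc, n_side)


def radius_for_l_size(target, b, box_mpc, n_side):
    """Boundary radius (h^-1 kpc) at which L_size equals `target`."""
    return 0.5 * target * linking_length_kpc(b, box_mpc, n_side)


if __name__ == "__main__":
    # Box sizes (h^-1 Mpc) and particles per side as quoted in the text.
    for name, box in (("Bolshoi", 250.0), ("MultiDark", 1000.0)):
        link = linking_length_kpc(0.2, box, 2048)
        r10 = radius_for_l_size(10.0, 0.2, box, 2048)
        print(f"{name}: b*lbar = {link:.1f} h^-1 kpc, R(L_size=10) = {r10:.0f} h^-1 kpc")
```

For $b = 0.2$ this gives linking lengths of about 24.4 and 97.7 $h^{-1}$ kpc and limiting radii of about 122 and 488 $h^{-1}$ kpc for the two boxes.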

In the Appendix, we show that the extra mass identified by
the FOF algorithm at a given resolution (i.e., a given $L_{\rm size}$) can
be accurately corrected by the following formula motivated by
percolation theory:

$$
\frac{M}{M^{\infty}} = 1 + 0.22\,\alpha\,L_{\rm size}^{-1/\nu}\,\frac{d\ln M_\Delta}{dp}. \tag{16}
$$



Here, $M^{\infty}$ denotes the mass of the halo that FOF would identify
at infinite resolution, $\nu$ is a critical exponent from percolation
theory and is $\approx 1.33$ in our case (see the Appendix for
details), and $\alpha$ denotes the logarithmic slope of the halo density
profile at the boundary predicted by percolation theory, $R_\Delta$. For
an NFW density profile, $\alpha$ is given by

$$
\alpha = 1 + \frac{2c_\Delta}{1 + c_\Delta}. \tag{17}
$$



The probability $p(r)$ (see Appendix for the connection to percolation
theory) at a given radius depends upon the number
density of particles at that radius, $n(r)$, via

$$
p(r) = 1 - \exp\left[-\frac{\pi}{6}\,(b\bar{l})^3\,n(r)\right], \tag{18}
$$



and $d\ln M_\Delta/dp$ denotes the derivative of the logarithm of the
mass with respect to $p$ at the boundary predicted by the percolation
threshold, $R_\Delta$. Larger values of $L_{\rm size}$ correspond to higher
resolution, and the mass measured by the FOF algorithm tends
to $M^{\infty}$ asymptotically. Note that our correction formula depends
upon the number of halo particles, $N_\Delta$, the linking
length parameter $b$, and the concentration parameter, $c_\Delta$.
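Two ingredients of the correction can be checked numerically in a few lines: the NFW logarithmic slope at the boundary and the probability $p$ as a function of local density. The sketch below is only an illustration. It treats equation (17) as the magnitude of the slope (the sign convention is an assumption here), and it expresses the density in units of the mean density, so that $(b\bar{l})^3 n = b^3\,n/\bar{n}$. All function names are hypothetical.

```python
import math


def nfw_density(x):
    """NFW profile shape, rho(x) proportional to 1 / [x (1 + x)^2], x = r / r_s."""
    return 1.0 / (x * (1.0 + x) ** 2)


def alpha_nfw(c):
    """Equation (17), read as the magnitude of the logarithmic slope at x = c."""
    return 1.0 + 2.0 * c / (1.0 + c)


def numerical_log_slope(x, eps=1e-6):
    """Central-difference estimate of d ln(rho) / d ln(x)."""
    lo, hi = x * (1.0 - eps), x * (1.0 + eps)
    d_ln_rho = math.log(nfw_density(hi)) - math.log(nfw_density(lo))
    return d_ln_rho / (math.log(hi) - math.log(lo))


def occupation_probability(n_over_nbar, b):
    """Equation (18) with the local density given in units of the mean density."""
    return 1.0 - math.exp(-math.pi / 6.0 * b**3 * n_over_nbar)


if __name__ == "__main__":
    c = 5.0
    print(alpha_nfw(c), numerical_log_slope(c))
    # Probability at a local density of 0.652960 / b^3 times the mean, b = 0.2.
    print(occupation_probability(0.652960 / 0.2**3, 0.2))
```

The numerical derivative equals minus the value of equation (17), confirming that the expression matches the NFW slope in magnitude.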

The circles in Figure [8] show the result of this correction.
The figure shows that the mass corrected by this formula
is independent of $L_{\rm size}$. The triangles, on the other hand,
show the empirical correction of Warren et al. (2006), which
clearly fails to correct the effect fully. This is not surprising,
as this formula was devised to correct resolution bias in the
halo mass function, rather than the mass of individual idealized
NFW halos. As we show below, other resolution effects affect
masses of real CDM halos and thereby the halo mass function.
The presented exercise simply indicates that the formula
of Warren et al. (2006) does not describe the mass bias of the
idealized halos considered here.

Also note that even at infinite resolution the FOF algorithm
selects a mass which is smaller than $M_\Delta$ by 2%. This is because
the boundary of FOF halos is not a step function even
at infinite resolution (see Fig. [1]). We defer detailed discussion
of this effect to the Appendix and show that this small
additional correction can also be calculated from percolation
theory. The bold circles in Fig. [8] show the result of correcting
the masses taking into account this additional small effect. As
the figure shows, the full correction brings the value of the
FOF halo masses into good agreement with the true mass $M_\Delta$.

Figure [9] shows the results of the Monte Carlo realizations
of spherical NFW halos of differing concentrations carried out
by Lukić et al. (2009; shown by squares) and the predictions of
our model (shown by solid lines). These authors applied the
FOF algorithm with $b = 0.2$ to identify halos from the realizations
and showed that the FOF mass of halos depends on the
concentration of their density distribution. Lukić et al. (2009)
defined both the reference halo mass, $M_{200c}$, and concentration,
$c_{200c}$, relative to the radius, $R_{200c}$, enclosing an overdensity of
200 times the critical density of the universe. They found that the
FOF mass is generally significantly different from $M_{200c}$ and that
the difference depends on $c_{200c}$ and the number of particles in
a halo (an effect similar to that discussed above).

We show our percolation theory-motivated prediction for
the ratio of the FOF halo masses to $M_{200c}$, calculated by using
Eq. [16] and after applying the correction for the boundary of
the halo, as solid lines in Figure [9]. The prediction is in excellent
agreement with the results of Lukić et al. (2009), and
it accurately captures the dependence of the $M_{\rm FOF}/M_{200c}$ ratio on
the concentration and particle number found by Lukić et al.
(2009). We would like to note that the correction formula
presented in Lukić et al. (2009) is a numerical fit to their results
and is only valid for the linking length parameter $b = 0.2$, for
which they calibrated their correction. The correction based on
equation [16] is valid for different values of $b$, concentrations,
and values of the numerical resolution ($L_{\rm size}$).

In the Appendix, we also test our correction against simulated
halos with varying slopes of the number density profile
and show that it works remarkably well for different slopes.
We also show that we are able to explain the empirical results
for isothermal halos found by Warren et al. (2006).

Given that the density of CDM halos decreases rapidly near
the outer virialized regions, an overestimate of mass for small
$L_{\rm size}$ corresponds to an underestimate of the enclosed
overdensities of FOF halos. This underestimate can be seen
in the form of a downturn of overdensity for halos from ΛCDM
simulations observed in Figures [5] and [6]. For a fixed mass and
fixed value of $b$, the Bolshoi simulation has a larger value of
$L_{\rm size}$ than the MultiDark simulation. This explains why the
downturn occurs at lower halo masses for the Bolshoi than
for the MultiDark simulation. It is also clear from Eq. [15] that
$L_{\rm size} \propto b^{-1}$, and therefore the downturn in overdensity shifts
to smaller masses for decreasing values of $b$.

We note that the empirical formula given by Warren et al. (2006) does
not explain the results of their isothermal halos.

5.2. Resolution dependence of the FOF mass for real ΛCDM
halos

In the previous subsection, we showed that the mass of halos
selected by the FOF algorithm depends upon $L_{\rm size}$. The
mass $M$ selected by FOF at finite $L_{\rm size}$ can be larger than $M^{\infty}$
by as much as $5-20\%$ for small values of $L_{\rm size}$. This effect, if
not corrected for, can potentially introduce systematic errors
in the determination of the mass function using halos selected
by FOF. We have also shown that the percolation theory
motivated formula given by eq. [16] is able to correct this dependence
of the mass on $L_{\rm size}$ for spherical NFW halos (or for
spherical halos with a power law density profile). Real halos,
however, are not spherical and contain substructure. In this
section, we therefore test the correction formula derived for
idealized halos against undersampled versions of real halos
selected from cosmological simulations.

For this purpose, we make use of the L1000W simulation of
size $L_{\rm b} = 1\,h^{-1}$ Gpc, described in detail in Tinker et al. (2008).
The simulation follows the evolution of dark matter particles
in a ΛCDM cosmology with parameters that are slightly different
from those of the Bolshoi and MultiDark simulations: the
matter density and the baryon density in units of the critical
density, $\Omega_{\rm m} = 0.27$ and $\Omega_{\rm b} = 0.044$, the Hubble constant
$h = H_0/(100\,{\rm km\,s^{-1}\,Mpc^{-1}}) = 0.70$, the rms amplitude of linear
fluctuations in spheres of radius $8\,h^{-1}$ Mpc, $\sigma_8 = 0.79$, and
the power law slope of the initial power spectrum, $n_s = 0.95$.
We run the FOF algorithm with a linking length parameter
$b = 0.2$ on the redshift zero snapshot of the simulation. For
the purpose of our tests, we focus our attention on the 25 most
massive halos selected by FOF.

We selected all particles within a radius $R_{\rm max} = 10\,h^{-1}$ Mpc
of the center of mass of each of these halos. We have
verified that all the particles of each halo selected by FOF
lie well within $R_{\rm max}$. We created 1000 subsamples of the
particles around every halo, each using only a fraction
$f \in \{0.2, 0.4, 0.6, 0.8\}$ of the particles. We then ran FOF on each
of these subsamples using a linking length parameter
$b = 0.2 f^{-1/3}$. We use the symbol $\mu_f$ to denote the ratio of the
mass selected by FOF when run on a subsample with a fraction
$f$ of the original particles to the mass of the FOF halo
when using all the particles.
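The subsampling test can be sketched with a brute-force FOF group finder. This is an illustration rather than the code used for the paper: it assumes each particle is kept independently with probability $f$, weights surviving particles by $1/f$ so that masses are comparable, and uses an $O(N^2)$ pair search suitable only for small particle sets. The names `fof_groups`, `subsample`, and `mass_ratio` are hypothetical.

```python
import random


def fof_groups(points, link):
    """Brute-force friends-of-friends: join all pairs closer than `link`."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    link2 = link * link
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            if d2 < link2:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values(), key=len, reverse=True)


def subsample(points, f, rng):
    """Keep each particle independently with probability f (an assumption)."""
    return [p for p in points if rng.random() < f]


def mass_ratio(points, lbar, f, rng, b=0.2):
    """mu_f: mass of the largest FOF group in an f-subsample over the full-sample mass."""
    full = len(fof_groups(points, b * lbar)[0])
    kept = subsample(points, f, rng)
    # The linking length grows as f^(-1/3), following the subsample's mean spacing.
    sub_link = b * f ** (-1.0 / 3.0) * lbar
    sub = len(fof_groups(kept, sub_link)[0]) if kept else 0
    return (sub / f) / full
```

Calling `mass_ratio` repeatedly with independent random generators for a fixed `f` yields a distribution of $\mu_f$ values of the kind discussed in the text.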

In the left hand panel of Fig. [10] we show the distribution
of $\mu_f$ for different values of $f$ using different line types. Note
that the peak of the distribution shifts towards larger values
of $\mu_f$ for smaller values of $f$. This is qualitatively similar to
the behavior of FOF discussed in § [5]. However, we also notice
that the distribution of $\mu_f$ has a significant tail towards smaller
values of $\mu_f$. In roughly one third of the cases (9 out of 25),
the FOF algorithm often fails to bridge a structure in the outer
parts of the halo with the main halo. The effect appears less
severe because we have plotted the combined distribution of
$\mu_f$ values for the 25 halos. However, in the case of halos for
which bridging is an issue, the distribution of $\mu_f$ is clearly
bimodal.

The right hand panel of Fig. [10] shows the cumulative distribution
of $\mu_f$. Note that smaller values of $f$ have a slightly
larger tendency to avoid bridging. This counteracts the tendency
to select larger masses at smaller values of $f$. If we
assign a mass to each halo for a given value of $f$ as the average
of the FOF mass over the 1000 subsamples, we often find
that this average FOF mass increases as $f$ increases, contrary
to our idealized NFW halos. Clearly, the average is sensitive
to the tails of the distribution. Therefore, we used the
median of the FOF masses of the 1000 subsamples to test our
correction formula.

We denote the median mass selected by the FOF algorithm
when run on a fraction $f$ of the particles by $M_f$ and the median
mass after correcting for the finite size effect using eq. [16] by
$M_f^{\infty}$. The top panel of Figure [12] shows the ratio $M_f^{\infty}/M_{1.0}^{\infty}$
for the 25 most massive halos. Our correction formula, which
worked extremely well for the idealized spherical NFW halos,
seems to systematically overcorrect for the finite size effect
for small values of $f$ by $\approx 3-5\%$.

The two plausible causes for this behavior are: (i) the
non-sphericity of real halos, and (ii) the presence of substructure
in real halos. We carried out another set of Monte-Carlo
simulations of idealized triaxial halos where the number density
of particles is given by an NFW-like profile with the radius $r$
replaced by $\zeta$ such that

$$
\zeta^2 = \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2}.
$$

We used values of $a/c = 0.6$ and $b/c = 0.8$, typical for halos
found in numerical simulations of dark matter. We have
verified that the correction formula given by eq. [16] works
perfectly well even if our triaxial halos are incorrectly assumed
to be spherical. Our use of the spherically averaged number
density distribution to determine the correction does not
introduce any systematic errors. We also experimented with
particles whose number density distribution follows a power
law in radius and found an identical result.

To investigate the effects of substructure, we carried out the 
following test. We first obtained the SPH estimate of the den- 
sity at the location of all particles in each of the halos using 
128 nearest neighbor particles. We used the position of the 
particle with the largest density as the center of the halo. We 
then randomly reassigned the angular coordinates of each of 
the particles within a $10\,h^{-1}$ Mpc sphere with respect to the
center of the halo. In this manner, we were able to disperse the 
substructure over a wider range of angular coordinates while 
still preserving the radially averaged density profile. We then 
repeated our exercise of running FOF on subsampled versions 
of this set of particles. 
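The angular shuffling step can be sketched as follows. The text does not specify how the new angles are drawn, so this sketch assumes each particle receives an independent isotropic direction about the halo center while keeping its radius. The function name is hypothetical.

```python
import math
import random


def shuffle_angles(points, center, rng):
    """Assign each particle a new isotropic direction about `center`, keeping its radius."""
    shuffled = []
    for p in points:
        r = math.dist(p, center)
        cos_theta = rng.uniform(-1.0, 1.0)      # uniform in cos(theta) gives isotropy
        sin_theta = math.sqrt(1.0 - cos_theta**2)
        phi = rng.uniform(0.0, 2.0 * math.pi)   # uniform azimuthal angle
        shuffled.append((
            center[0] + r * sin_theta * math.cos(phi),
            center[1] + r * sin_theta * math.sin(phi),
            center[2] + r * cos_theta,
        ))
    return shuffled
```

Because only the angles change, the radially averaged density profile is preserved exactly while clumps are smeared over the sphere.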

We show the results of this exercise in Figure [11], which
shows the distribution of values of $\mu_f$ thus obtained. In contrast
to Figure [10], the distribution of $\mu_f$ is much more symmetric,
with no significant presence of tails. The peak of the
distribution occurs at larger values of $\mu_f$ as $f$ is decreased.
The lower panel of Figure [12] shows the ratio $M_f^{\infty}/M_{1.0}^{\infty}$ for
halos where the substructure has been dispersed. Contrary to
the results in the top panel, in this case our correction formula
corrects masses accurately. This shows that the failure of
the correction formulae derived for idealized halos is due to
substructure present in real CDM halos simulated with sufficiently
high resolution.

The results of this exercise show that the masses selected
by FOF for realistic halos cannot be corrected for finite size
effects in a straightforward manner. Although the
percolation-motivated correction formula we derived for halos without
substructure (eq. [16]) is highly accurate, it cannot be blindly
applied to correct halo masses selected by the FOF algorithm.
Substructure introduces strong resolution-dependent effects.
The amount of substructure depends on the resolution of simulations
in a non-trivial way and will vary for halos of different
mass within a simulation. It will also vary with redshift for
a given halo mass. This indicates that any empirical formula
designed to correct masses of the halo mass function for resolution
effects will also depend in a non-trivial way on resolution,
cosmology, and redshift. We thus caution against the use of
empirical formulae that depend just upon the number of particles
in a halo calibrated for a single cosmology and redshift,
as these will likely be inaccurate for other cosmologies and
redshifts.

Fig. 10. — The left hand panel shows the probability distribution of the ratio $\mu_f$ of the mass selected by the FOF algorithm when applied to the 1000 subsamples
of a fraction $f$ of the particles around the 25 most massive halos to the mass of the halo selected by the FOF algorithm with $f = 1.0$. The right hand panel shows
the cumulative distribution of $\mu_f$.

Fig. 11. — Same as Fig. [10] except when the angular coordinates of the particles around the center of the FOF halo are shuffled to disperse substructure (see
text for details). The horizontal axis of both panels is $\mu_f = M_f/M_{1.0}$.

6. DISCUSSION AND CONCLUSIONS 

In this paper we have explored properties of halos identified
by the FOF algorithm, focusing on the halo boundary. Using
idealized Monte Carlo realizations of spherical NFW halos we
showed that the boundary of the FOF halos spans a range of local
overdensities and is inherently "fuzzy." The fuzziness of
the boundary increases with decreasing number of halo particles.
We demonstrate that these results can be interpreted
in terms of percolation theory, which we discuss in detail
in the Appendix. The value of the characteristic local overdensity
within the FOF boundary derived from our Monte Carlo
realizations and predicted by percolation theory is given by
$\delta_{\rm FOF} = 0.6529\,b^{-3} - 1$, which gives $\delta_{\rm FOF} = 80.61$ for
the commonly used value of $b = 0.2$. This is significantly
larger than the local overdensity of $\approx 60$ usually assumed for
this value of linking length. Correspondingly, the enclosed
overdensity of typical FOF halos is significantly larger than
180 and ranges from $\sim 250$ to $\sim 600$. The specific value of the
enclosed overdensity is determined by the concentration of the
halo (density distribution) and therefore depends on cosmology,
halo mass, and redshift. We predict this dependence using
a simple analytic model based on the NFW density profile
and show that this model reproduces the results of cosmological
simulations of the ΛCDM cosmology at different halo masses,
redshifts, and values of the linking length $b$.

For a given linking length $b$, the range of overdensities (i.e.,
the fuzziness) in the boundary of FOF halos increases with
decreasing number of halo particles due to changing properties
of percolation for smaller values of the parameter
$L_{\rm size} = 2R_\Delta/(b\bar{l})$, where $R_\Delta$ is the effective radius of the FOF boundary.
For a given simulation, this results in a systematic and
increasing overestimate of the FOF mass with decreasing halo
mass. This effect has been found empirically by Warren et al.
(2006) and Lukić et al. (2009).

We demonstrate how it can be understood qualitatively on
the basis of percolation theory. We also present an accurate
formula for correcting this systematic FOF mass bias for idealized
halos without substructure. This formula is accurate for
different values of the linking length $b$, halo concentrations, and
values of the parameter $L_{\rm size}$. We note, however, that this accurate
correction requires knowledge of the halo concentration-mass
relation, which itself would need to be accurately calibrated
for different cosmologies. Moreover, as we demonstrated in
§ [5.2], substructure in real halos introduces additional substantial
resolution-dependent biases into masses of FOF halos.
Given that the amount of substructure depends on the resolution of
simulations and on simulation cosmology and redshift in a
non-trivial way, any empirical mass correction formula should also
depend in a non-trivial way on resolution, cosmology, and
redshift.

The concentration and non-trivial resolution dependence of
enclosed overdensities and masses of the FOF halos make it 
difficult to interpret their raw mass function and its univer- 
sality physically in terms of an underlying model of nonlinear 
collapse. For instance, as we note in § [4], the concentration dependence
of FOF overdensity is likely behind smaller deviations
of the FOF halo mass function from universality, as some of 
the real redshift evolution of the halo mass function is partially 
cancelled by redshift evolution of halo concentrations. Although
such partial cancellation may work for a single ΛCDM
cosmology, it will not work in general as halo concentrations 
do depend on cosmological parameters. All this also makes 
it more complicated to connect FOF halo masses to observa- 
tional estimates of masses, which are typically made within 
spherical apertures enclosing a fixed (and fairly high) over- 
density, with concentration of density profile not known a pri- 
ori. 

Nevertheless, the results of Courtin et al. (2010) do indicate that
universality of the halo mass function can be improved if the
cosmology dependence of non-linear virialization is taken into
account properly in the definition of halo mass. In § [4] we
show that their empirical findings can be understood better in
terms of our results and model. Further exploration of this
issue is definitely warranted. Overall, even though the interpretation
of FOF halo statistics is more complicated in light of our
results, an improved understanding of the FOF identified halos
makes any interpretation more robust.

Our results should also be useful in constructing mock catalogs
of galaxies based on FOF halo catalogs. To reproduce
galaxy clustering properly, this procedure requires good
knowledge of the internal overdensity of the identified halos. The model
and percolation theory results presented in this paper can be
used to accurately estimate this overdensity even for halos
with small numbers of particles.

APPENDIX

BRIEF REVIEW OF THE RELEVANT ASPECTS OF PERCOLATION THEORY 

Consider a point process that generates a set of points on an $N$-dimensional manifold. Percolation theory deals with the
statistics of clusters (or groups of friends in FOF terminology) formed by grouping together neighboring points on the manifold.
Traditionally, the percolation problem is defined on a lattice where the occupation of each lattice cell is determined by a random
process (Stauffer & Aharony 1994). However, the continuum percolation (Swiss-cheese) model is more relevant to our discussion
of the FOF algorithm (Roberts & Storey 1968; Domb 1972; Lorenz & Ziff 2001). In this appendix, we briefly describe this model
and how the profile of the boundary of a FOF halo can be understood in more detail.

The Swiss-cheese percolation model considers a set of spheres of equal radius, $R$, whose centers are distributed by a random
Poisson process with a constant average number density $n(x)$ in an $L \times L \times L$ volume, where $L \gg R$. The spheres can be thought
of as spheres carved in a slab of cheese, from which the model derives its name. Groups of overlapping spheres form clusters of
varying sizes. The largest cluster that forms in the system is of particular importance, and for a fixed value of $R$, its size depends
upon the average number density of spheres in the system. As the number density of spheres is increased, the size of the largest
cluster increases until at a critical number density the largest cluster size becomes $\sim L$. This event is called percolation, the
smallest number density at which it happens is called the critical percolation threshold, and the corresponding cluster is called the
infinite cluster. The critical density, $n_c$, in units of $1/(2R)^3$ is a universal constant and has been accurately measured by extensive
Monte-Carlo simulations: $n_c = 0.652960 \pm 0.000005$ (Lorenz & Ziff 2001).

The linking length of the FOF algorithm, $b\bar{l}$, corresponds to the diameter $2R$ of the spheres in the Swiss-cheese percolation
model. The centers of overlapping spheres correspond to "friend" particles in the FOF algorithm, as the distance between the
centers is less than the linking length. In the FOF language, the critical density threshold is therefore $n_{\rm crit} = n_c/(2R)^3 = n_c b^{-3}\bar{l}^{-3}$,
which corresponds to an overdensity of $\delta = n_{\rm crit}/\bar{n} - 1 = n_c b^{-3} - 1$.
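The critical overdensity above is simple to evaluate for any linking length. The short sketch below does so for the values of $b$ that appear in the text; the function name is illustrative.

```python
N_C = 0.652960  # critical density in units of (2R)^-3 (Lorenz & Ziff 2001)


def critical_overdensity(b, n_c=N_C):
    """Local overdensity at the FOF boundary, delta = n_c * b^-3 - 1."""
    return n_c / b**3 - 1.0


if __name__ == "__main__":
    for b in (0.17, 0.2, 0.24):
        print(f"b = {b:.2f}: delta = {critical_overdensity(b):.2f}")
```

For $b = 0.2$ this gives $\delta \approx 80.62$, i.e. a density of 81.62 times the mean. The value 80.61 quoted in the conclusions follows from rounding $n_c$ to 0.6529.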

For the Swiss-cheese model, the probability for any given point $x$ in the $L \times L \times L$ volume to belong to a non-zero number of
spheres is given by

$$
p(x) = 1 - \exp\left[-\frac{4}{3}\pi R^3 n(x)\right] = 1 - \exp\left[-\frac{\pi}{6}(2R)^3 n(x)\right]. \tag{A1}
$$

It is conventional to define the percolation problem in terms of this probability instead of the number density $n(x)$, in which case
the critical threshold for percolation is related to $n_c$ via

$$
p_c = 1 - \exp\left(-\frac{\pi}{6}\,n_c\right). \tag{A2}
$$
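As a quick arithmetic check (not part of the original derivation), inserting the quoted value of $n_c$ into equation (A2) gives the numerical value of the critical probability:

```latex
p_c = 1 - \exp\left(-\frac{\pi}{6}\times 0.652960\right)
    = 1 - \exp(-0.3419)
    \approx 0.2896 .
```

Roughly 29% of the volume is therefore covered by spheres at the percolation threshold.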



Close to the percolation threshold, the probability that any point $x$ belongs to the infinite cluster, $P_\infty$, also called the strength of
the infinite cluster, follows the scaling relation

$$
P_\infty \propto (p - p_c)^{\beta}, \tag{A3}
$$

where $\beta$ is a constant which depends upon the dimensionality of the problem. Only a few problems in percolation have exact
analytical solutions. Hence, the constant $\beta$ has to be determined by Monte-Carlo experiments, and it has been found to be approximately
equal to 0.42 for percolation in three dimensions (see, e.g., Stauffer & Aharony 1994). Another quantity of interest is the
correlation or connectivity length, denoted by $\xi$, and defined as the average distance between two points that belong to the
same cluster. As $p$ approaches $p_c$, $\xi$ follows the scaling relation given by

$$
\xi \propto |p - p_c|^{-\nu}, \tag{A4}
$$

where the constant $\nu$ again depends upon the dimensionality of the problem and is approximately equal to 0.88 in three dimensions
and 4/3 in two dimensions.

How do these basics of percolation theory relate to the halos identified by the FOF algorithm? In the context of the Monte
Carlo realizations of spherical NFW halos considered in § 2, the particle distribution of a given realization is a set of points
distributed in a spherical volume of radius $2R_{180}$. The FOF algorithm with linking length b applied to these points treats particles
as a set of spheres of radius $R = bl/2$. Those particles whose spheres overlap are considered friends. The difference from the simple
uniform density example considered above is that our halos have a non-uniform density distribution. Thus, instead of considering
percolation in a uniform distribution for different particle number densities, we are considering percolation as the
number density of particles decreases with increasing radius. For a given b, there will be a certain radius at which the critical
number density for percolation, $n_c$ (and the corresponding probability $p_c$), is reached. Particles around this radius will have a high
probability $P_\infty$ to be a part of the infinite cluster, i.e., to be joined into the FOF halo. It is these particles that form the boundary of
an FOF halo. Below we consider the properties of this boundary in the context of percolation theory.
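The linking rule described above (two particles are friends when spheres of radius bl/2 around them overlap, i.e. when their separation is below bl) can be sketched with a brute-force union-find. This is an illustrative O(N²) implementation and not the group finder used for the results in this paper; the function name and the toy points are hypothetical.

```python
import math


def friends_of_friends(points, linking_length):
    """Group points whose separation is below linking_length (brute force).

    Returns a list of groups (largest first), each a list of point indices.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            # Spheres of radius linking_length / 2 overlap exactly when the
            # centres are closer than linking_length.
            if math.dist(points[i], points[j]) < linking_length:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values(), key=len, reverse=True)


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (0.15, 0.0, 0.0), (0.3, 0.0, 0.0), (1.0, 1.0, 1.0)]
    print(friends_of_friends(pts, 0.2))
```

In the toy example the first and third points are farther apart than the linking length, yet they end up in the same group because each is a friend of the second point. This chaining is what makes FOF a percolation problem.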

DETAILED ANALYSIS OF THE FOF BOUNDARY OF NFW HALOS 

In the left panel of Figure 13 we show the probability p for a point to be within a distance bl/2 from any particle as a function
of its position $x = r/r_s$ for the Monte Carlo realizations of spherical NFW halos analyzed in § 2. In percolation theory, for point
distributions with non-uniform density the infinite cluster is defined as the cluster connected to spheres that lie in the region where
the probability $p \simeq 1$. In our case, this is equivalent to the group that consists of particles at the center of the halo and is the
largest group found by the FOF algorithm.

Fig. 13. The probability p as a function of the radius (left panel) and the probability to be a part of the infinite cluster, $P_\infty$, as a function of p (right panel) for the
Monte Carlo realizations of spherical NFW halos (c = 10) analyzed in § 2. In the left panel the critical threshold for percolation, $p_c$, is shown with the horizontal
dashed line. In the right panel $p_c$ is shown by the solid vertical line; different line types correspond to halo realizations with different numbers of particles, with
line types and colors corresponding to the same halos as in Figure 1 (from left to right, lines correspond to increasing $N_{180}$, starting from 100 particles).

Fig. 14. The left hand panel shows the strength of the infinite cluster, $P_\infty$, as a function of $p - p_c$ for our Monte Carlo realizations of spherical NFW halos.
Different line types correspond to halos generated with varying numbers of particles. Line types and colors correspond to the same halos as in Figure 1. The
right hand panel shows the strength as a function of p for the highest resolution halo. The solid red line shows the prediction of percolation theory for a uniform
distribution of particles.



We denote the fraction of spheres at any given radius that belong to the infinite cluster by $f_{\rm accept}$. This fraction is simply the
ratio of the strength of the infinite cluster to the probability for any point to belong to any sphere:

$f_{\rm accept} = \frac{P_\infty}{p}$ . (B1)

In the right panel of Figure 13 we show $P_\infty$ as a function of p for the NFW halo realizations. The line types and colors are the
same as in Figures 1 to 4. For $p \gg p_c$, $f_{\rm accept} = 1$ and $p\,f_{\rm accept} = p$. Near the percolation threshold $p_c$, the fraction $f_{\rm accept}$ falls
steadily from one to zero in a way that depends upon the mean interparticle separation in the halo relative to the linking length.

We first investigate the strength of the infinite cluster, $P_\infty$, for $p > p_c$. In the left panel of Figure 14 we show the dependence
of $P_\infty$ on $p - p_c$ for $p > p_c$, obtained by analysing the boundary of the NFW halo realizations identified by the FOF algorithm. The bold
grey line shows the percolation theory prediction given by eq. A3 with $\beta = 0.43$. This prediction is in very good agreement
with the results of the Monte Carlo simulations over an order of magnitude in probability p for the realizations with the largest
number of particles. In the right hand panel, we compare this prediction to the results from the highest resolution halo. We find
that percolation theory describes the behavior of the FOF boundary for $p > p_c$ quite well. This explains why our empirical results
for the FOF boundary do not converge to a step function.

Note that the simple scaling of eq. A3 predicts that $P_\infty \to 0$ as $p \to p_c$. This scaling, however, is correct strictly for a uniform
distribution of particles in an infinite volume. In contrast, realistic halos cover a finite volume and have significant density
gradients. These effects change the predictions of percolation theory (e.g., Stauffer & Aharony 1994; Rosso et al. 1986).

Fig. 15. The fraction of particles that are joined by the FOF algorithm (with b = 0.2) into the main halo as a function of the radius in units of $R_{180}$ for our
Monte Carlo realizations of spherical NFW halos. The bold solid line shows the percolation theory prediction for uniform particle density, which can be compared to
the results of our simulations shown with lines of different style and color. The number of particles in each halo realization is indicated in the legend.

defines the width of the FOF halo boundary. For halos with $L_{\rm size} \lesssim 10$ the FOF algorithm overestimates halo masses by $\gtrsim 10\%$ (see Figures 15 and 17).

For the standard case of percolation in an infinite volume with uniform mean density, the connectivity length $\xi$ (expressed in
units of the sphere size or linking length) is the only scale in the problem, and near the critical threshold $p_c$, the connectivity
length $\xi$ exhibits critical scaling behavior, $\xi \propto |p_c - p|^{-\nu}$. In the more general case, other scales like the system size $L_{\rm size}$ or the
local scale length $s = p/|\nabla p|$ can be important as well. For example, in finite volumes percolation occurs when the connectivity
length becomes of order the system size, $\xi \sim O(L_{\rm size})$, which occurs at a lower density than infinite percolation. The percolation
threshold, therefore, decreases as the system size decreases: setting $\xi \approx L_{\rm size}$ in Eq. (A4) shows that
the finite-size threshold scales as (Stauffer & Aharony 1994)

$p_c - p_c(L_{\rm size}) \propto L_{\rm size}^{-1/\nu}$ . (B2)

Similarly, density gradients also modify the percolation transition. Regions where the density is below the naive critical threshold,
$p < p_c$, can still be linked to regions above threshold if the connectivity length is of order the distance to the super-critical region.
In other words, gradients will smear out the percolation transition by an amount that is straightforward to estimate. If we Taylor
expand about the location where $p = p_c$, writing $p(x) = p_c + (\nabla p)\,x + \ldots$, then setting $x \approx \xi$ shows that the transition is smeared
by a distance of roughly

$\xi \propto |p_c - p(\xi)|^{-\nu} = |p_c - p_c - (\nabla p)\,\xi|^{-\nu} = |\nabla p\,\xi|^{-\nu} \;\Rightarrow\; \xi \propto |\nabla p|^{-\nu/(1+\nu)}$ . (B3)

This corresponds to a width $\sigma_p$ in $p(x)$ such that

$\sigma_p \propto |\nabla p|\,\xi \propto |\nabla p|^{1/(1+\nu)}$ . (B4)
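The exponents in the gradient-smearing argument follow from solving $\xi \propto |\nabla p\,\xi|^{-\nu}$ for $\xi$. A short sketch, with a hypothetical helper name, evaluates them for the two values of $\nu$ that appear in this appendix:

```python
def smearing_exponents(nu):
    """Return (a, b) such that xi ∝ |grad p|**(-a) and sigma_p ∝ |grad p|**b.

    Solving xi = (|grad p| * xi)**(-nu) gives a = nu / (1 + nu), and
    sigma_p = |grad p| * xi then gives b = 1 - a = 1 / (1 + nu).
    """
    if nu <= 0:
        raise ValueError("nu must be positive")
    return nu / (1.0 + nu), 1.0 / (1.0 + nu)


if __name__ == "__main__":
    for nu in (0.88, 4.0 / 3.0):
        a, b = smearing_exponents(nu)
        print(f"nu = {nu:.3f}: xi ∝ |grad p|^(-{a:.3f}), sigma_p ∝ |grad p|^({b:.3f})")
```

Since b is always less than one, the width of the transition in p shrinks more slowly than the gradient itself as the profile is made shallower.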

Thus, for non-uniform distributions, the density gradient results in a much more gradual transition of $P_\infty$ to zero, which extends
to $p < p_c$ (Rosso et al. 1986), as illustrated in Fig. 14.

For realistic halos, both of the above effects (finite size and density gradient) could be significant, but their importance must
diminish as the particle number in the halo increases. To judge the importance of these effects for finite particle numbers, the
quantity of interest is $L_{\rm size} = 2R_\Delta/(bl)$, where $R_\Delta$ is the threshold radius at which the probability $p = p_c$. In terms of the number
of particles in a FOF halo, $L_{\rm size}$ is given by

$L_{\rm size} = \frac{2R_\Delta}{bl} = \frac{2}{b}\left(\frac{3N_\Delta}{4\pi\Delta}\right)^{1/3}$ . (B5)

The analogous quantity for the gradient scale length will presumably be of the same order as $L_{\rm size}$ for typical outer slopes in halos,
$|d\log\rho/d\log r| \sim 2$-$3$.

In Figure 16 we show $L_{\rm size}$ as a function of the number of particles, $N_\Delta$, for the halo realizations presented in § 2. The FOF
algorithm with a linking length parameter b = 0.2 selects an overdensity $\Delta = 390.49$ for these halos with concentration $c_{180} = 10$.
We note that even for $N_\Delta \sim 10000$, $L_{\rm size} \sim 10$. For such small values of $L_{\rm size}$, the threshold is significantly less than the infinite,
uniform density threshold, $p_c(L_{\rm size}) < p_c$, meaning that the FOF algorithm joins particles at radii corresponding to $p < p_c$ into
the main halo. This also leads to an increase in the mass selected by FOF and a corresponding decrease in the overdensity.
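The dependence of the system size on particle number can be sketched as follows. The sketch assumes that $\Delta$ is the mean enclosed density within $R_\Delta$ in units of the mean density, so that $N_\Delta = (4\pi/3) R_\Delta^3 \Delta / l^3$ and hence $L_{\rm size} = 2R_\Delta/(bl) = (2/b)\,[3N_\Delta/(4\pi\Delta)]^{1/3}$. The function name is hypothetical.

```python
import math


def l_size(n_particles, overdensity, b=0.2):
    """System size in units of the linking length, L_size = 2 R_Delta / (b l).

    Assumes N_Delta = (4 pi / 3) * R_Delta**3 * Delta / l**3, i.e. Delta is the
    mean enclosed density in units of the mean density.
    """
    if n_particles <= 0 or overdensity <= 0 or b <= 0:
        raise ValueError("all arguments must be positive")
    r_over_l = (3.0 * n_particles / (4.0 * math.pi * overdensity)) ** (1.0 / 3.0)
    return 2.0 * r_over_l / b


if __name__ == "__main__":
    for n in (100, 1_000, 10_000, 100_000, 1_000_000):
        print(f"N = {n:>9d}: L_size = {l_size(n, 390.49):.1f}")
```

Under these assumptions $L_{\rm size}$ grows only as $N_\Delta^{1/3}$: increasing the particle number by a factor of eight merely doubles the system size, and a halo with $10^4$ particles still spans only of order ten linking lengths.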

As we saw above, percolation theory predicts that the threshold value for percolation scales with the size of the system (in units
of the linking length) as $p_c - p_c(L_{\rm size}) \propto L_{\rm size}^{-1/\nu}$ (Stauffer & Aharony 1994). This implies that the mass of halos selected by the FOF
algorithm will change as a function of $L_{\rm size}$ as

$\Delta M \propto \left|\frac{dM}{dp}\right| \left[p_c - p_c(L_{\rm size})\right] \propto \left|\frac{dM}{dp}\right| L_{\rm size}^{-1/\nu}$ . (B6)
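To illustrate how quickly the finite-size shift of the threshold decays, the sketch below evaluates the scaling $L_{\rm size}^{-1/\nu}$ for the two values of $\nu$ discussed in this appendix. The amplitude is hypothetical; only ratios between different system sizes are meaningful here.

```python
def finite_size_shift(l_size, nu, amplitude=1.0):
    """Scaling of p_c - p_c(L_size) ∝ L_size**(-1/nu).

    The amplitude is a hypothetical normalisation; the mass excess of eq. B6
    additionally carries the factor |dM/dp| evaluated at the boundary.
    """
    if l_size <= 0 or nu <= 0:
        raise ValueError("l_size and nu must be positive")
    return amplitude * l_size ** (-1.0 / nu)


if __name__ == "__main__":
    for nu in (0.88, 4.0 / 3.0):
        ratio = finite_size_shift(10.0, nu) / finite_size_shift(100.0, nu)
        print(f"nu = {nu:.3f}: shift at L_size = 10 is {ratio:.2f}x that at L_size = 100")
```

A larger $\nu$ gives a slower decay, so resolution effects persist to larger particle numbers.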



To test this formula, we performed another set of Monte Carlo realizations of spherical halos. We assumed that the particles
follow a power law number density profile

$n(r) \propto r^{-\alpha}$ . (B7)
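Radii for such a power law profile can be drawn by inverse-transform sampling. The following is a minimal sketch, assuming the halo is truncated at R = 1 and that $\alpha < 3$ so the enclosed mass converges at the center. It is not the generator used for the realizations in this paper, and the function name is hypothetical.

```python
import random


def sample_power_law_radii(n_particles, alpha, seed=None):
    """Draw radii in (0, 1] for a number density n(r) ∝ r**(-alpha).

    The enclosed fraction is M(<r) = r**(3 - alpha) for alpha < 3, so a
    uniform deviate u maps to r = u**(1 / (3 - alpha)).
    """
    if not alpha < 3:
        raise ValueError("alpha must be < 3 for a finite central mass")
    rng = random.Random(seed)
    exponent = 1.0 / (3.0 - alpha)
    # 1 - random() lies in (0, 1], which avoids r = 0 exactly.
    return [(1.0 - rng.random()) ** exponent for _ in range(n_particles)]


if __name__ == "__main__":
    radii = sample_power_law_radii(10_000, alpha=2.0, seed=42)
    inside_half = sum(r < 0.5 for r in radii) / len(radii)
    print(f"fraction within r < 0.5: {inside_half:.3f}")
```

For $\alpha = 2$ the enclosed fraction is linear in r, so roughly half of the particles should fall within r < 0.5. Full positions follow by pairing each radius with an isotropically drawn direction.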

Following Warren et al. (2006), we arbitrarily normalized the halos to have radius and mass equal to unity, M = 1 and R = 1, and
used a linking length equal to

, (B8) 

where $N$ is the number of particles within R = 1, to identify halos. We generated halos with $\alpha \in \{1.5, 1.75, 2.0, 2.25, 2.5, 2.75\}$.
For each a, we generated 10^ realizations each consisting of 100, 500 and 1250 particles, 100 realizations each consisting of 
10000 and 80000 particles, ten realizations of 6.4 x 10^ particles, two realizations of 6.4 x 10^ and one realization with 10^ 
particles. The value of the radius predicted using eq. 5 for these halos is given by



1.25 (3 -a) 



-l/a 

(B9) 



Note that Ra R = 1 is the effective radius of the FOF boundary and we used the fact that = 1 in our model in the derivation 
of the above equation. The corresponding value of $L_{\rm size}$ depends upon $\alpha$ and is given by



bl U.25/ 11.25(3-0-)/ ^ ^ 



Note that for increasing $\alpha$, the same number of particles thus corresponds to a smaller value of $L_{\rm size}$. We would also like to point
out that the form of the density profile we chose in Eq. B7 above requires $\alpha < 3$ to avoid the divergence in mass at r = 0. This
does not imply that our formalism to correct the masses of low resolution halos breaks down for $\alpha \geq 3$. As long as $L_{\rm size}$, $dM/dp$
and $\alpha$ are calculated appropriately at the boundary of the percolation threshold, our formalism should work.

The volume of the system enclosed by the boundary $R_\Delta$ is equal to $\frac{4}{3}\pi R_\Delta^3$ and the number of spheres of radius $(bl)/2$ that can fit in this volume is equal to
$L_{\rm size}^3 = \frac{(4/3)\pi R_\Delta^3}{(4/3)\pi (bl/2)^3}$, which gives $L_{\rm size} = 2R_\Delta/(bl)$.

In each panel of Figure 17, square symbols show the mass of the main FOF halo as a function of $L_{\rm size}$ for $\alpha$ = 2.0, 2.25, 2.5
and 2.75. Other values of $\alpha$ give similar results. The mass of the FOF halo asymptotes to its true value as the number of particles
with which the halo is sampled is increased. This effect was first identified empirically by Warren et al. (2006), and triangles show
their proposed empirical correction. The figure shows, however, that this correction does not account for the entire effect. The
circles show the FOF masses corrected using eq. B6 with a proportionality constant of $0.22\,\alpha$ and $\nu = 4/3$.



This correction almost entirely eliminates the $L_{\rm size}$ dependence of the FOF-identified halo mass. The circles thus represent the
mass, $M_{\rm FOF}^{\infty}$, that would be selected by the FOF algorithm if it were run on a realization with an infinite number of particles. We note
that for steeper density profiles (i.e., larger values of $\alpha$) a larger number of particles is required to converge to $M_{\rm FOF}^{\infty}$.

As was pointed out in § 5 and is clearly shown in Figure 17, the mass $M_{\rm FOF}^{\infty}$ is smaller than the mass enclosed within an
overdensity $\Delta$ given by Eq. 10 by a few percent. This is because the boundary profile of the FOF halos is not a step function but
has a specific shape that can be approximately described by eq. A3 (see Fig. 15). This allows us to calculate an estimate of the
fraction $M_{\rm FOF}^{\infty}/M_\Delta$ as



Here the fraction $f_{\rm accept}$ and $P_\infty$ are given by eqs. B1 and A3, respectively. As can be seen in Figure 8, this boundary effect
correction leads to values of the masses that are very close to the true mass $M_\Delta$.

In this appendix, we have presented a thorough analysis of the boundary of the FOF halos in the context of percolation theory. 
We have shown that percolation theory accurately predicts the shape of the boundary of the FOF halos close to the density 
threshold for percolation, at least for halos without significant amounts of substructure (see § 5). We have also discussed how
the finite number of particles with which a halo is sampled affects this boundary and have found a percolation theory motivated 
formula to correct for this dependence. Finally, we have also shown how the fraction of mass identified by FOF in an infinite 
resolution halo relates to the mass within a spherical overdensity given by eq. 10. These results provide a basis and theoretical
interpretation for the empirical results presented in the main text of the paper. 




(BID 




(B12) 



We have verified with simple three dimensional gradient percolation experiments similar to Rosso et al. (1983) that $\nu = 4/3$, in contrast to $\nu = 0.88$ found
for three dimensions in the case of uniform continuum percolation experiments.

Fig. 17. The mass of the FOF halos characterized by different $L_{\rm size}$ for halos with power law density profiles $n(r) \propto r^{-\alpha}$. Different panels correspond to
different logarithmic slopes $\alpha$, as indicated in the legends. Squares show the mass selected by the FOF algorithm run on Monte Carlo realizations of halos, while
triangles show masses corrected using the empirical correction of Warren et al. (2006). Open circles correspond to the FOF masses corrected using Eq. B11. The
horizontal solid lines show the true mass $M_\Delta$ for each halo model.



